Whenever I run cabal install, or a fun VSCode extension gets updated, or anything like that, I am running code that could be malicious or buggy.
In a way it is surprising and reassuring that, as far as I can tell, this commonly does not happen. Most open source developers out there seem to be nice and well-meaning, after all.
I can do the few operations that need my credentials (like git push, git pull from private repositories, gh pr create) from the outside, and the actual build environment can do without access to these secrets.
The user experience I thus want is a quick way to enter a development environment where I can do most of the things I need to do while programming (network access, running command line and GUI programs), with access to the current project, but without access to my actual /home directory.
I initially followed the blog post Application Isolation using NixOS Containers by Marcin Sucharski and got something working that mostly did what I wanted, but then a colleague pointed out that tools like firejail can achieve roughly the same with a less global setup. I tried to use firejail, but found it to be a bit too inflexible for my particular whims, so I ended up writing a small wrapper around the lower-level sandboxing tool https://github.com/containers/bubblewrap.
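To get a feel for bubblewrap before the full script, here is a minimal invocation (an illustrative sketch, using only flags that also appear in the script below) that starts a shell with a read-only view of the host's root and private /dev, /proc and /tmp:

# Minimal bubblewrap sandbox: read-only host, fresh /dev, /proc and /tmp.
# Note: --unshare-all also unshares the network; the real script below
# re-enables it with --share-net.
bwrap \
  --unshare-all \
  --ro-bind / / \
  --proc /proc \
  --dev /dev \
  --tmpfs /tmp \
  bash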
The script, called dev and included below, builds a new filesystem namespace with minimal /proc and /dev directories and its own /tmp directories. It then bind-mounts some directories to make the host's NixOS system available inside the container (/bin, /usr, the nix store including the daemon socket, stuff for OpenGL applications). My user's home directory is taken from ~/.dev-home, and some configuration files are bind-mounted for convenient sharing. I intentionally don't share most of the configuration; for example, a direnv enable in the dev environment should not affect the main environment. The X11 socket for graphical applications and the corresponding .Xauthority file are made available. And finally, if I run dev in a project directory, this project directory is bind-mounted writable, and the current working directory is preserved.
The effect is that I can type dev on the command line to enter dev mode rather conveniently. I can run development tools, including graphical ones like VSCode, and especially the latter, with its extensions, is now part of the sandbox. To do a git push I either exit the development environment (Ctrl-D) or open a separate terminal. Overall, the inconvenience of switching back and forth seems worth the extra protection.
Clearly, this isn't going to hold against a determined and maybe targeted attacker (e.g. access to the X11 and the nix daemon sockets can probably be used to escape easily). But I hope it will help against a compromised dev dependency that just deletes or exfiltrates data, like keys or passwords, from the usual places in $HOME.
One kink I have not ironed out yet is opening links from within the sandbox in my usual browser (maybe I can teach xdg-desktop-portal to heed my default browser settings). For now I will live with manually copying and pasting URLs; we'll see how long this lasts. It would also help if I did not run programs like evolution or firefox inside the container, and if I did not even have VSCode or cabal available outside, so that it's less likely that I forget to enter dev before using these tools.
It shouldn't be too hard to cargo-cult some of the NixOS Containers infrastructure to have a separate system configuration that I can manage as part of my normal system configuration and make available to bubblewrap here. This was a cheap experiment, though, and going back to what I did before would be easy.

The dev script (at the time of writing):
#!/usr/bin/env bash
extra=()
if [[ "$PWD" == /home/jojo/build/* ]] || [[ "$PWD" == /home/jojo/projekte/programming/* ]]
then
extra+=(--bind "$PWD" "$PWD" --chdir "$PWD")
fi
if [ -n "$1" ]
then
cmd=( "$@" )
else
cmd=( bash )
fi
# Caveats:
# * access to all of /etc
# * access to /nix/var/nix/daemon-socket/socket, and is trusted user (but needed to run nix)
# * access to X11
exec bwrap \
  --unshare-all \
  \
  `# blank slate` \
  --share-net \
  --proc /proc \
  --dev /dev \
  --tmpfs /tmp \
  --tmpfs /run/user/1000 \
  \
  `# Needed for GLX applications, in particular alacritty` \
  --dev-bind /dev/dri /dev/dri \
  --ro-bind /sys/dev/char /sys/dev/char \
  --ro-bind /sys/devices/pci0000:00 /sys/devices/pci0000:00 \
  --ro-bind /run/opengl-driver /run/opengl-driver \
  \
  --ro-bind /bin /bin \
  --ro-bind /usr /usr \
  --ro-bind /run/current-system /run/current-system \
  --ro-bind /nix /nix \
  --ro-bind /etc /etc \
  --ro-bind /run/systemd/resolve/stub-resolv.conf /run/systemd/resolve/stub-resolv.conf \
  \
  --bind ~/.dev-home /home/jojo \
  --ro-bind ~/.config/alacritty ~/.config/alacritty \
  --ro-bind ~/.config/nvim ~/.config/nvim \
  --ro-bind ~/.local/share/nvim ~/.local/share/nvim \
  --ro-bind ~/.bin ~/.bin \
  \
  --bind /tmp/.X11-unix/X0 /tmp/.X11-unix/X0 \
  --bind ~/.Xauthority ~/.Xauthority \
  --setenv DISPLAY :0 \
  \
  --setenv container dev \
  "${extra[@]}" \
  -- \
  "${cmd[@]}"
Package maintainers can guarantee package authorship through software signing [but] it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data on signing practices, but measured single platforms, did not consider time, and did not provide insight on factors that may influence signing. We lack a comprehensive, multi-platform understanding of signing adoption and relevant factors. This study addresses this gap. (arXiv, full PDF)
[The] principle of reusability […] makes it harder to reproduce projects' build environments, even though reproducibility of build environments is essential for collaboration, maintenance and component lifetime. In this work, we argue that functional package managers provide the tooling to make build environments reproducible in space and time, and we produce a preliminary evaluation to justify this claim. The abstract continues with the claim that "Using historical data, we show that we are able to reproduce build environments of about 7 million Nix packages, and to rebuild 99.94% of the 14 thousand packages from a 6-year-old Nixpkgs revision." (arXiv, full PDF)
This paper thus proposes an approach to automatically identify configuration options causing non-reproducibility of builds. It begins by building a set of builds in order to detect non-reproducible ones through binary comparison. We then develop automated techniques that combine statistical learning with symbolic reasoning to analyze over 20,000 configuration options. Our methods are designed to both detect options causing non-reproducibility, and remedy non-reproducible configurations, two tasks that are challenging and costly to perform manually. (HAL Portal, full PDF)
[…] fedora-repro-build, which attempts to reproduce an existing package within a koji build environment. Although the project's README file lists a number of fields that will always or almost always vary, and there is a non-zero list of other known issues, this is an excellent first step towards full Fedora reproducibility.
[…] released versions 256, 257 and 258 to Debian and made the following additional changes:

- […] gpg's use-embedded-filenames. Many thanks to Daniel Kahn Gillmor (dkg@debian.org) for reporting this issue and providing feedback. [ ][ ]
- […] struct.unpack-related errors when parsing Python .pyc files. (#1064973) [ ]
- […] rdb_expected_diff on non-GNU systems, as %p formatting can vary, especially with respect to MacOS. [ ]
- […] pytest 8.0. [ ]
- Prefer the 7zip package (over p7zip-full) after a Debian package transition. (#1063559) [ ]
- […] test_zip black clean. [ ]
- […] diff(1) correctly [ ][ ] thanks!

And lastly, Vagrant Cascadian pushed updates in GNU Guix for diffoscope to versions 255, 256, and 258, and updated trydiffoscope to 67.0.6.
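As a reminder of what diffoscope does (an illustrative invocation, with made-up file names): given two builds of the same package, it recursively unpacks them and reports the differences, optionally as an HTML report:

# Compare two builds of the same package and write an HTML report.
diffoscope --html report.html foo_1.0_amd64.deb foo-rebuild_1.0_amd64.deb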
- […] README.rst to match. [ ][ ]
- […] the --vary=build_path.path option. [ ][ ][ ][ ]
- […] the SOURCE_DATE_EPOCH page. [ ]
- […] the SOURCE_DATE_EPOCH documentation regarding datetime.datetime.fromtimestamp. Thanks, James Addison. [ ]
- […] /usr/bin/du --apparent-size in the Jenkins shell monitor. [ ]
- […] arm64 nodes. [ ]
- Set /proc/$pid/oom_score_adj to -1000 if it has not already been done. [ ]
- Add the opemwrt-target-tegra and jtx tasks to the list of zombie jobs. [ ][ ]
- […] armhf architecture build nodes, virt32z and virt64z, and insert them into the Munin monitoring. [ ][ ][ ][ ]

Elsewhere, […] the tegra target with mpc85xx [ ], Jan-Benedict Glaw updated the NetBSD build script to use a separate $TMPDIR to mitigate out-of-space issues on a tmpfs-backed /tmp [ ], and Zheng Junjie added a link to the GNU Guix tests [ ].
Lastly, node maintenance was performed by Holger Levsen [ ][ ][ ][ ][ ][ ] and Vagrant Cascadian [ ][ ][ ][ ].
- gimagereader (date)
- grass (date-related issue)
- grub2 (filesystem ordering issue)
- latex2html (drop a non-deterministic log)
- mhvtl (tar)
- obs (build-tool issue)
- ollama (GZip embedding the modification time)
- presenterm (filesystem-ordering issue)
- qt6-quick3d (parallelism)
- flask-limiter
- python-parsl-doc (disable dynamic argument evaluation by the Sphinx autodoc extension)
- python3-pytest-repeat (remove entry_points.txt creation that varied by shell)
- python3-selinux (remove packaged direct_url.json file that embeds build path)
- python3-sepolicy (remove packaged direct_url.json file that embeds build path)
- pyswarms
- python-x2go
- snapd (fix timestamp header in packaged manual page)
- zzzeeksphinx (existing RB patch forwarded and merged (with modifications))

You can get in touch with the Reproducible Builds project via #reproducible-builds on irc.oftc.net, or via our mailing list: rb-general@lists.reproducible-builds.org
This post is a review for Computing Reviews of Constructed truths: truth and knowledge in a post-truth world, a book published by Springer.

Many of us grew up used to having some news sources we could implicitly trust, such as well-positioned newspapers and radio or TV news programs. We knew they would only hire responsible journalists rather than risk diluting public trust and losing their brand's value. However, with the advent of the Internet and social media, we are witnessing what has been termed the post-truth phenomenon. The undeniable freedom that horizontal communication has given us automatically brings with it the emergence of filter bubbles and echo chambers, and truth seems to become a group belief.

Contrary to my original expectations, the core topic of the book is not how current-day media brings about post-truth mindsets. Instead it goes into a much deeper philosophical debate: What is truth? Does truth exist by itself, objectively, or is it a social construct? If activists with different political leanings debate a given subject, is it even possible for them to understand the same points for debate, or do they truly experience parallel realities?

The author wrote this book clearly prompted by the unprecedented events that took place in 2020, as the COVID-19 crisis forced humanity into isolation and online communication. Donald Trump is explicitly and repeatedly presented throughout the book as an example of an actor that took advantage of the distortions caused by post-truth.

The first chapter frames the narrative from the perspective of information flow over the last several decades: how the emergence of horizontal, uncensored communication free of editorial oversight started empowering the netizens and created a temporary information-flow utopia. But soon afterwards, algorithmic gatekeepers started appearing, creating a set of personalized distortions of reality; users started getting news aligned to what they already showed interest in. This led to an increase in polarization and the growth of narrative-framing-specific communities that served as echo chambers for disjoint views on reality. This led to the growth of conspiracy theories and, necessarily, to the science denial and pseudoscience that reached unimaginable peaks during the COVID-19 crisis. Finally, when readers decide based on completely subjective criteria whether a scientific theory such as global warming is true or propaganda, or question what most traditional news outlets present as facts, we face the phenomenon known as fake news. Fake news leads to post-truth, a state where it is impossible to distinguish between truth and falsehood, where truth serves only a rhetorical function, making rational discourse impossible.

Toward the end of the first chapter, the tone of writing turns away from describing developments in the spread of news and facts over the last decades and goes deep into philosophy, into the very thorny subject pursued by said discipline for millennia: How can truth be defined? Can different perspectives bring about different truth values for any given idea? Does truth depend on the observer, on their knowledge of facts, on their moral compass or on their honest opinions?
Zoglauer dives into epistemology, following various thinkers' ideas on what can be understood as truth: constructivism (whether knowledge and truth values can be learnt by an individual building from their personal experience), objectivity (whether experiences, and thus truth, are universal, or whether they are naturally individual), and whether we can proclaim something to be true when it corresponds to reality.

For the final chapter, he dives into the role information and knowledge play in assigning and understanding truth value, as well as the value of second-hand knowledge: Do we really own knowledge because we can look up facts online (even if we carefully check the sources)? Can I, without any medical training, diagnose a sickness and treatment by honestly and carefully looking up its symptoms in medical databases?

Wrapping up, while I very much enjoyed reading this book, I must confess it is completely different from what I expected. This book digs much more into the abstract than into information flow in modern society, or the impact on early-2020s politics as its editorial description suggests. At 160 pages, the book is not a heavy read, and Zoglauer's writing style is easy to follow, even across the potentially very deep topics it presents. Its main readership is not necessarily computing practitioners or academics. However, for people trying to better understand epistemology through its expressions in the modern world, it will be a very worthy read.
This post is a review for Computing Reviews of 10 things software developers should learn about learning, an article published in Communications of the ACM.

As software developers, we understand the detailed workings of the different components of our computer systems. And, probably due to how computers were presented since their appearance as digital brains in the 1940s, we sometimes believe we can transpose that knowledge to how our biological brains work, be it as learners or as problem solvers. This article aims at making the reader understand several mechanisms related to how learning and problem solving actually work in our brains. It focuses on helping expert developers convey knowledge to new learners, as well as on learners who need to get up to speed and start coding. The article's narrative revolves around software developers, but much of what it presents can be applied to different problem domains.

The article takes on this mission through ten points, with roughly the same space given to each of them, starting with wrong assumptions many people have about the similarities between computers and our brains. The first section, Human Memory Is Not Made of Bits, explains the brain processes of remembering as a way of strengthening the force of a memory (reconsolidation) and the role of activation in related network pathways. The second section, Human Memory Is Composed of One Limited and One Unlimited System, goes on to explain the organization of memories in the brain between long-term memory (functionally limitless, permanent storage) and working memory (storing small amounts of information used for solving a problem at hand). However, the focus soon shifts to how experience in knowledge leads to different ways of using the same concepts, the importance of going from abstract to concrete knowledge applications and back, and the role of skills repetition over time.

Toward the end of the article, the focus shifts from the mechanical act of learning to expertise. Section 6, The Internet Has Not Made Learning Obsolete, emphasizes that problem solving is not just putting together the pieces of a puzzle; searching online for solutions to a problem does not activate the neural pathways that would get fired up otherwise. The final sections tackle the differences that expertise brings to play when teaching or training a newcomer: the same tools that help the beginner's productivity as training wheels will often hamper the expert user's, as their knowledge has become automated.

The article is written in a very informal and easy-to-read tone and vocabulary, and brings forward several issues that might seem like common sense but do ring bells when it comes to my own experiences both as a software developer and as a teacher. The article closes by suggesting several books that further expand on the issues it brings forward. While I could not identify a single focus or thesis with which to characterize this article, the several points it makes will likely help readers better understand (and bring forward to consciousness) mental processes often taken for granted, and consider often-overlooked aspects when transmitting knowledge to newcomers.
With the recent cancellation of the queer.af domain registration by the Taliban, the fragile and difficult nature of country-code top-level domains (ccTLDs) has once again been comprehensively demonstrated.
Since many people may not be aware of the risks, I thought I'd give a solid explainer of the whole situation, and explain why you should, in general, not have anything to do with domains which are registered under ccTLDs.
A TLD is the last part of a domain name (the bit after the final dot in what you see after the https:// in your web browser's location bar). It's the com in example.com, or the af in queer.af.
There are two kinds of TLDs: country-code TLDs (ccTLDs) and generic TLDs (gTLDs).
Despite all being TLDs, they're very different beasts under the hood.
The queer.af cancellation is interesting because, at the time the domain was reportedly registered, in 2018, Afghanistan had what one might describe as, at least, a different political climate. Since then, of course, things have changed, and the new bosses have decided to get a bit more active. Those running queer.af seem to have seen the writing on the wall, and were planning on moving to another, less fraught, domain, but hadn't completed that move when the Taliban came knocking.
To register a domain under .eu, you have to be a resident of the EU. When the UK ceased to be part of the EU, residents of the UK were no longer EU residents.
When the UK ceased to be part of the EU, residents of the UK were no longer EU residents.
Cue much unhappiness, wailing, and gnashing of teeth when this was pointed out to Britons.
Some decided to give up their domains, and move to other parts of the Internet, while others managed to hold onto them by various legal sleight-of-hand (like having an EU company maintain the registration on their behalf).
In any event, all very unpleasant for everyone involved.
[…] the price of .sc domain names was increased from US$25 to US$75. No reason, no warning, just pay up. Then there is Libya's ccTLD, .ly. These domain registrations weren't (and aren't) cheap, and it's hard to imagine that at least some of that money wasn't going to benefit the Gaddafi regime.
Similarly, the British Indian Ocean Territory, which has the io ccTLD, was created in a colonialist piece of chicanery that expelled thousands of native Chagossians from Diego Garcia.
Money from the registration of .io domains doesn't go to the (former) residents of the Chagos islands; instead it gets paid to the UK government. Again, I'm not trying to suggest that all gTLD operators are wonderful people, but it's not particularly likely that the direct beneficiaries of the operation of a gTLD stole an island chain and evicted the residents.
[…] the .au namespace some years ago. Essentially, while a ccTLD may have geographic connotations now, there's not a lot of guarantee that they won't fall victim to scope creep in the future.
Finally, it might be somewhat safer to register under a ccTLD if you live in the location involved.
At least then you might have a better idea of whether your domain is likely to get pulled out from underneath you.
Unfortunately, as the .eu example shows, living somewhere today is no guarantee you'll still be living there tomorrow, even if you don't move house.
In short, I'd suggest sticking to gTLDs.
They re at least lower risk than ccTLDs.
[Service]
# innd will change the permissions of /run/news/ when started: without
# creating it now with mode 0775 then that will change the ACL mask.
RuntimeDirectoryMode=0775
# allow user md to run ctlinnd(8), which creates sockets in /run/news/
ExecStartPost=/usr/bin/setfacl --modify user:md:rwx $RUNTIME_DIRECTORY

The non-obvious issue here is that the innd daemon on startup will change the directory permissions in a way which sets a more restrictive (non-group-writeable) ACL mask, and this would make the newly created user ACL ineffective. The solution is to create the directory group-writeable from the start. (Beware: this creates a trivial privileges escalation from md to news.)
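To verify the result (a sketch using the standard acl tools, with the paths from the snippet above), one can check that the mask entry still grants the permissions the user ACL needs after innd has started:

# user:md:rwx is only effective if the mask entry still includes rwx.
getfacl /run/news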
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies; the list goes on. In short, it was bad. Quite bad.

The attack pivoted on PyTorch's use of self-hosted runners, as well as submitting a pull request to address a trivial typo in the project's README file, to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.
[…] archlinux-userland-fs-cmp: the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval'd or otherwise executed. Parsers are written in a memory-safe language. More information about the tool can be found in their announcement message, as well as on the tool's homepage. A GIF of the tool in action is also available.
Problems with the SOURCE_DATE_EPOCH code?
Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I'm not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.

Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.
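For context, the intent of such a snippet, rendered here as a shell sketch rather than the C code under discussion (variable names are illustrative and GNU date is assumed), is to honour SOURCE_DATE_EPOCH when set and valid, and to fall back to the current time otherwise:

# Honour SOURCE_DATE_EPOCH if set and a decimal integer, else use "now".
if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
  case "$SOURCE_DATE_EPOCH" in
    *[!0-9]*) echo "invalid SOURCE_DATE_EPOCH" >&2; exit 1 ;;
  esac
  build_timestamp="$SOURCE_DATE_EPOCH"
else
  build_timestamp="$(date +%s)"
fi
# Emit a fixed, UTC-formatted date for embedding in build artifacts.
date -u -d "@$build_timestamp" +%Y-%m-%dT%H:%M:%SZ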
[…] SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ]
[…] released versions 254 and 255 to Debian, focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well as considerable work from Vekhir to fix compatibility between various subtly-incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!
- Reduce the number of arm64 architecture workers from 24 to 16. [ ]
- […] arm64 nodes when they hit an OOM (out of memory) state. [ ]
- Update the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
- […] the iptables tool everywhere, else our custom rc.local script fails. [ ]
- […] the /srv/workspace/pbuilder directory on boot. [ ]
- Limit the chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
- […] the armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ]
Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]
- cython (nondeterministic path issue)
- deluge (issue with modification time of .egg file)
- gap-ferret, gap-semigroups & gap-simpcomp (nondeterministic config.log file)
- grpc (filesystem ordering issue)
- hub (random)
- kubernetes1.22 & kubernetes1.23 (sort-related issue)
- kubernetes1.24 & kubernetes1.25 (go -trimpath vs random issue)
- libjcat (drop test files with random bytes)
- luajit (use new d option for deterministic bytecode output)
- meson [ ][ ] (sort the results from a Python filesystem call)
- python-rjsmin (drop GCC instrumentation artifacts)
- qt6-virtualkeyboard+others (parallelism/race bug)
- SoapySDR (parallelism-related issue)
- systemd (sorting problem)
- warewulf (CPIO modification time issue, etc.)
- guake (Schroedinger file due to race condition)
- qhelpgenerator-qt5 (timezone localization; fix also merged upstream for Qt6)
- sphinx (search index doctitle sorting)

[…] the mm-common package in Debian; this was quickly fixed, however. [ ]
You can get in touch with the Reproducible Builds project via #reproducible-builds on irc.oftc.net, or via our mailing list: rb-general@lists.reproducible-builds.org
The clearance was entered initially with estimated import charges of £400.03, consisting of £387.83 VAT and a £12.20 disbursement fee. This original entry regrettably did not include the freight cost for calculating the VAT, and as such when submitted for final entry the VAT value was adjusted to include this and an amended invoice was issued for an additional £39.84.

HMRC calculate the amount against which VAT is raised using the value of goods, insurance and freight; however, they also may apply a VAT adjustment figure. The VAT adjustment is based on many factors (incidental costs in regards to a shipment), which includes a charge for currency conversion if the invoice does not list values in Sterling, but the main one is due to the inland freight from the airport of destination to the final delivery point. As this charge varies (for example, from EMA to Edinburgh would be £150, from EMA to Derby would be £1), each year UPS must supply HMRC with all values incurred for entry build-up, and they give an average which UPS have to use on the entry build-up as the VAT adjustment.

The correct calculation for the import charges is therefore as follows:

Goods value divided by exchange rate: 2,489.53 EUR / 1.1683 = 2,130.89 GBP
Duty: goods value plus freight (5%): 2,130.89 GBP + 5% = 2,237.43 GBP; that total times the duty rate: x 0% = 0 GBP
VAT: goods value plus freight (100%): 2,130.89 GBP + 0 = 2,130.89 GBP; that total plus duty and VAT adjustment: 2,130.89 GBP + 0 GBP + 7.49 GBP = 2,138.38 GBP; that total times 20% VAT = 427.67 GBP

As detailed above we must confirm that the final VAT charges applied to the shipment were correct, and that no refund of this is therefore due.

This looks very like HMRC-originated nonsense. If only they had put it on the original bills! It's completely ridiculous that it took four months and near-litigation to obtain it.

Disbursement fee

One more thing. UPS billed me a £12 disbursement fee. When you import something, there's often tax to pay. The courier company pays that to the government, and the consignee pays it to the courier. Usually the courier demands it before final delivery, since otherwise they end up having to chase it as a debt. It is common for parcel companies to add a random fee of their own. As I note in my Particulars, there isn't any legal basis for this. In my own offer of settlement I proposed that UPS should:
State under what principle of English law (such as, what enactment or principle of Common Law) you levy the disbursement fee (or refund it).

To my surprise they actually responded to this in their own settlement letter. (They didn't, for example, mention the harassment at all.) They said (emphasis mine):
A disbursement fee is a fee for amounts paid or processed on behalf of a client. It is an established category of charge used by legal firms, amongst other companies, for billing of various ancillary costs which may be incurred in completion of service. Disbursement fees are not covered by a specific law, nor are they legally prohibited. Regarding UPS's disbursement fee, this is an administrative charge levied for the use of UPS's deferment account to prepay import charges for clearance through CDS. This charge would therefore be billed to the party that is responsible for the import charges, normally the consignee or receiver of the shipment in question. The disbursement fee as applied is legitimate, and as you have stated is a commonly used and recognised charge throughout the courier industry, and I can confirm that this was charged correctly in this instance.

On UPS's analysis, they can just make up whatever fee they like. That is clearly not right (and I don't even need to refer to consumer protection law, which would also make it obviously unlawful). And, that everyone does it doesn't make it lawful: there are so many things that are ubiquitous but unlawful, especially nowadays when much of the legal system, especially consumer protection regulators, has been underfunded to beyond the point of collapse. Next time this comes up I might have a go at getting the fee back. (Obviously I'll have to pay it first, to get my parcel.)

ParcelForce and Royal Mail

I think this analysis doesn't apply to ParcelForce and (probably) Royal Mail. I looked into this in 2009, and I found that Parcelforce had been given the ability to write their own private laws: Schemes made under section 89 of the Postal Services Act 2000. This is obviously ridiculous, but I think it was the law in 2009. I doubt the intervening governments have fixed it.

Furniture

Oh, yes, the actual furniture. The replacements arrived intact and are great :-).
/C=BE/O=GlobalSign nv-sa/CN=AlphaSSL CA - SHA256 - G4
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Domain Validation Secure Server CA
/C=GB/ST=Greater Manchester/L=Salford/O=Sectigo Limited/CN=Sectigo RSA Organization Validation Secure Server CA
/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository//CN=Starfield Secure Certificate Authority - G2
/C=AT/O=ZeroSSL/CN=ZeroSSL RSA Domain Secure Site CA
/C=BE/O=GlobalSign nv-sa/CN=GlobalSign GCC R3 DV TLS CA 2020

Rather than try to work with raw issuers (because, as Andrew Ayer says, The SSL Certificate Issuer Field is a Lie), I mapped these issuers to the organisations that manage them, and summed the counts for those grouped issuers together.
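(As an aside, before the grouped counts below: issuer strings in this one-line format can be extracted from any certificate with openssl; cert.pem is a placeholder.)

# Print the issuer DN of a certificate.
openssl x509 -in cert.pem -noout -issuer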
Issuer | Compromised Count |
---|---|
Sectigo | 170 |
ISRG (Let's Encrypt) | 161 |
GoDaddy | 141 |
DigiCert | 81 |
GlobalSign | 46 |
Entrust | 3 |
SSL.com | 1 |
Issuer | Issuance Volume | Compromised Count | Compromise Rate |
---|---|---|---|
Sectigo | 88,323,068 | 170 | 1 in 519,547 |
ISRG (Let's Encrypt) | 315,476,402 | 161 | 1 in 1,959,480 |
GoDaddy | 56,121,429 | 141 | 1 in 398,024 |
DigiCert | 144,713,475 | 81 | 1 in 1,786,586 |
GlobalSign | 1,438,485 | 46 | 1 in 31,271 |
Entrust | 23,166 | 3 | 1 in 7,722 |
SSL.com | 171,816 | 1 | 1 in 171,816 |
Issuer | Issuance Volume | Compromised Count | Compromise Rate |
---|---|---|---|
Entrust | 23,166 | 3 | 1 in 7,722 |
GlobalSign | 1,438,485 | 46 | 1 in 31,271 |
SSL.com | 171,816 | 1 | 1 in 171,816 |
GoDaddy | 56,121,429 | 141 | 1 in 398,024 |
Sectigo | 88,323,068 | 170 | 1 in 519,547 |
DigiCert | 144,713,475 | 81 | 1 in 1,786,586 |
ISRG (Let's Encrypt) | 315,476,402 | 161 | 1 in 1,959,480 |
SELECT SUM(sub.NUM_ISSUED[2] - sub.NUM_EXPIRED[2])
FROM (
  SELECT ca.name,
         max(coalesce(coalesce(nullif(trim(cc.SUBORDINATE_CA_OWNER), ''), nullif(trim(cc.CA_OWNER), '')), cc.INCLUDED_CERTIFICATE_OWNER)) as OWNER,
         ca.NUM_ISSUED,
         ca.NUM_EXPIRED
  FROM ccadb_certificate cc, ca_certificate cac, ca
  WHERE cc.CERTIFICATE_ID = cac.CERTIFICATE_ID
    AND cac.CA_ID = ca.ID
  GROUP BY ca.ID
) sub
WHERE sub.name ILIKE '%Amazon%'
   OR sub.name ILIKE '%CloudFlare%'
  AND sub.owner = 'DigiCert';

The number I get from running that query is 104,316,112, which should be subtracted from DigiCert's total issuance figures to get a more accurate view of what DigiCert's regular customers do with their private keys. When I do this, the compromise rates table, sorted by the compromise rate, looks like this:
Issuer | Issuance Volume | Compromised Count | Compromise Rate |
---|---|---|---|
Entrust | 23,166 | 3 | 1 in 7,722 |
GlobalSign | 1,438,485 | 46 | 1 in 31,271 |
SSL.com | 171,816 | 1 | 1 in 171,816 |
GoDaddy | 56,121,429 | 141 | 1 in 398,024 |
"Regular" DigiCert | 40,397,363 | 81 | 1 in 498,732 |
Sectigo | 88,323,068 | 170 | 1 in 519,547 |
All DigiCert | 144,713,475 | 81 | 1 in 1,786,586 |
ISRG (Let's Encrypt) | 315,476,402 | 161 | 1 in 1,959,480 |
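As a quick sanity check of the "Regular" DigiCert volume above (shell arithmetic over the numbers already quoted):

# All-DigiCert issuance minus the Amazon/CloudFlare share:
echo $(( 144713475 - 104316112 ))   # prints 40397363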
The less humans have to do with certificate issuance, the less likely they are to compromise that certificate by exposing the private key. While it may not be surprising, it is nice to have some empirical evidence to back up the common wisdom.

Fully-managed TLS providers, such as CloudFlare, AWS Certificate Manager, and whatever Azure's thing is called, are the platonic ideal of this principle: never give humans any opportunity to expose a private key. I'm not saying you should use one of these providers, but the security approach they have adopted appears to be the optimal one, and should be emulated universally.

The ACME protocol is the next best, in that there are a variety of standardised tools widely available that allow humans to take themselves out of the loop, but it's still possible for humans to handle (and mistakenly expose) key material if they try hard enough.

Legacy issuance methods, which either cannot be automated or require custom, per-provider automation to be developed, appear to be at least four times less helpful to the goal of avoiding compromise of the private key associated with a certificate.
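To illustrate the ACME point (a sketch using certbot, one ACME client among many; the domain and webroot path are placeholders):

# Obtain a certificate without a human ever handling the private key...
certbot certonly --webroot -w /var/www/example -d example.com
# ...and keep renewing it unattended, typically from a timer or cron job.
certbot renew --quiet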
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these differences, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
- Will the debhelper use filearray, filedoublearray or none of them to read the file? This topic has about 2 bits of entropy.
- Will the config file be executed if it is marked executable, assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bits of entropy depending on the context.
- Will the config file be subject to glob expansion? This topic sounds like a boolean but is a complicated mess. The globs can be handled by debhelper as it parses the file for you; in this case, the globs are applied to every token. However, this is not what dh_install does: there, the last token on each line is supposed to be a directory and therefore not subject to globs, so dh_install does the globbing itself afterwards, but only on part of the tokens. That is about 2 bits of entropy more. Actually, it gets worse...
- If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice the original debhelper maintainer made when introducing the feature in debhelper/8.9.12. Except, the dh_install behaviour interacts with that design choice and does enable glob expansion in the tool output, because dh_install globs manually after its filedoublearray call. A concrete example follows this list.
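To make these combinations concrete, here is a hypothetical executable debian/install using dh-exec (the paths and file names are made up); whether the glob and the rename take effect depends on exactly the rules above:

#!/usr/bin/dh-exec
# The glob in the first line may be expanded by debhelper or by dh_install;
# the => operator asks dh-exec to rename the file on installation.
build/bin/* usr/bin
data/mytool.conf => etc/mytool/mytool.conf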
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!
- When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
- When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed in those hints, meaning the file will at least be detected. As for the new introspectable hints and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
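For example, dh_assistant (shipped with recent debhelper versions) exposes some of this introspection as JSON; from an unpacked source tree one might run (the json.tool piping is just for readability):

# Machine-readable view of the package's debhelper hook targets.
dh_assistant detect-hook-targets | python3 -m json.tool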